Abstract

Modular decomposition is a technique that applies to graphs, but is not restricted to them.
The notion of a module appears naturally in the proofs of many graph-theoretical theorems.
Computing the modular decomposition tree is an important preprocessing step for solving a large
number of combinatorial optimization problems. Since the first polynomial-time algorithm in the
early 1970s, the algorithmics of modular decomposition has developed considerably.
This paper surveys the ideas and techniques that arose from this line of research.

1 Introduction 

Modular decomposition is a technique at the crossroads of several domains of combinatorics which
applies to many discrete structures such as graphs, 2-structures, hypergraphs, set systems and
matroids, among others. As a graph decomposition technique, it was introduced by Gallai [Gal67]
to study the structure of comparability graphs (those graphs whose edge set can be transitively
oriented). Roughly speaking, a module in a graph is a subset $M$ of vertices which share the same
neighbourhood outside $M$. Gallai showed that the family of modules of an undirected graph can be
represented by a tree, the modular decomposition tree. The notion of module has appeared in the
literature under the names closed set [Gal67], clan [EGMS94], autonomous set [Möh85b], clump [Bla78]...,
while modular decomposition is also called substitution decomposition [Möh85a] or X-join
decomposition [HM79]. See [MR84] for an early survey on this topic.

There is a large variety of combinatorial applications of modular decomposition. Modules can
help in proving structural results on graphs, as Gallai did for comparability graphs. More generally,
modular decomposition appears in (but is not limited to) the context of perfect graph theory. Indeed,
Lovász's proof of the perfect graph theorem [Lov72] involves clique modules. Notice also that a
number of perfect graph classes can be characterized by properties of their modular decomposition
tree: cographs, $P_4$-sparse graphs, permutation graphs, interval graphs... Refer to the books of
Golumbic [Gol80] and Brandstädt et al. [BLS99] for graph classes. We should also mention that the
modular decomposition tree is useful for solving optimization problems on graphs or other discrete
structures (see [Möh85b]). An example of such use is given in the last section.

In the late 1970s, modular decomposition was independently generalized to partitive set
families [CHM81] and to a combinatorial decomposition theory [CE80] which applies to graphs,
matroids and hypergraphs. More recently, the theory of partitive families and its variants has
been the foundation of decomposition schemes for various discrete structures, among which
2-structures [EHR99] and permutations [UY00, BCdMR08]. Besides, different graph decompositions
based on efficiently representable set families have been proposed. The split decomposition of [CE80]
relies on a bipartitive family on the vertex set. Refer to [BX08] for a survey of the recent
developments of these techniques.

A good feature of most of these decomposition schemes is that they can be computed in
polynomial time. Indeed, since the early 1970s, there have been a number of algorithms for computing the
modular decomposition of a graph (or for some variants of this problem). The first polynomial
algorithm is due to Cowan, James and Stanton [CJS72] and runs in $O(n^4)$. Successive improvements are
due to Habib and Maurer [HM79], who proposed a cubic time algorithm, and to Müller and Spinrad,
who designed a quadratic time algorithm. The first two linear time algorithms appeared
independently in 1994 [CH94, MS94]. Since then a series of simplified algorithms has been published, some
running in linear time [MS99, TCHP08], others in almost linear time [DGM01, MS00, HPV99]. The
list is not exhaustive. This line of research has yielded a series of interesting new algorithmic techniques
which, we believe, could be useful in other applications or topics of computer science. The aim of
this paper is to survey the algorithmic theory of modular decomposition.

The paper is organized as follows. The theory of partitive families and its application to the modular
decomposition of graphs are presented in Section 2. As an algorithmic appetizer, Section 3 addresses
the special case of totally decomposable graphs, namely the cographs, for which a linear time
algorithm has been known since 1985 [CPS85]. Partition refinement is an algorithmic technique that
proves to be very powerful for the modular decomposition problem, but also for other graph
applications (see e.g. [PT87, HPV99]). Section 4 is devoted to partition refinement. Section 5
describes the principle of a series of modular decomposition algorithms developed in the
mid-1990s. Section 6 explains how the modular decomposition can be efficiently computed via the recent
concept of factoring permutation [CHdM02]. Let us mention that we do not discuss the recent
linear time algorithm of Tedder et al. [TCHP08], even though we believe that this last algorithm
provides a positive answer to the problem of finding a simple linear time modular decomposition
algorithm. Actually, the key to Tedder et al.'s algorithm is to merge the ideas developed in Sections
5 and 6. The purpose of this paper is not to enter into the details of all the algorithmic techniques but
rather to present their main lines. Finally, the last section presents three recent applications of
modular decomposition in three different domains of computer science, namely pattern matching,
computational biology and parameterized complexity.

2 Partitive families 

Modular decomposition theory has to be understood as a special case of the theory of partitive
families, whose study dates back to the early 1980s [CE80, CHM81]. We briefly present the main
concepts and theorems of partitive family theory. We then introduce the modular decomposition
of graphs and discuss its elementary algorithmic aspects. This section ends with a discussion of two
important classes of graphs: indecomposable graphs (the prime graphs) and totally decomposable
graphs (known as the cographs).

2.1 Decomposition theorem of partitive families



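The symmetric difference between two sets $A$ and $B$ is denoted by $A \Delta B = (A \setminus B) \cup (B \setminus A)$.
Two subsets $A$ and $B$ of a set $S$ overlap if $A \cap B \neq \emptyset$, $A \setminus B \neq \emptyset$ and $B \setminus A \neq \emptyset$; we write $A \perp B$.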

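Definition 1 A family $\mathcal{S} \subseteq 2^S$ of subsets of $S$ is partitive if:

1. $S \in \mathcal{S}$, $\emptyset \notin \mathcal{S}$ and for all $x \in S$, $\{x\} \in \mathcal{S}$;

2. For any pair of subsets $A, B \in \mathcal{S}$ such that $A \perp B$:

(a) $A \cap B \in \mathcal{S}$;

(b) $A \setminus B \in \mathcal{S}$ and $B \setminus A \in \mathcal{S}$;

(c) $A \cup B \in \mathcal{S}$;

(d) $A \Delta B \in \mathcal{S}$.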

A family is weakly partitive if it satisfies all of the above conditions except possibly (2.d). Unless
explicitly mentioned, we will only consider partitive families.

Definition 2 An element $F \in \mathcal{S}$ is strong if it does not overlap any other element of $\mathcal{S}$. The set
of strong elements of $\mathcal{S}$ is denoted $\mathcal{S}_F$.

Obviously any trivial subset of $S$, namely $S$ or $\{x\}$ (for $x \in S$), is a strong element. Let us
remark that $\mathcal{S}_F$ is nested, i.e. the transitive reduction of the inclusion order of $\mathcal{S}_F$ is a tree $T_{\mathcal{S}}$,
which we call the strong element tree (see Figure 1). It follows that $|\mathcal{S}_F| = O(|S|)$.




Figure 1: The inclusion tree of the strong elements of the family 

S = {{1, 2,3,4,5,6, 7,8}, {1,2, 3}, {6, 7, 8}, {1,2}, {2,3}, {1,3}, {6, 7},{7,8}, {6,8}, 

{1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}} 
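
The strong elements of this small family can be recovered directly from Definition 2. The following is a minimal quadratic-time sketch (the function names `overlap` and `strong_elements` are illustrative and not taken from the literature); it simply tests every pair of members for overlap.

```python
def overlap(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and bool(a - b) and bool(b - a)


def strong_elements(family):
    """Members of `family` that overlap no other member (Definition 2)."""
    return [f for f in family if not any(overlap(f, g) for g in family)]


# The family of Figure 1 on the ground set {1, ..., 8}.
family = [frozenset(s) for s in (
    range(1, 9), {1, 2, 3}, {6, 7, 8},
    {1, 2}, {2, 3}, {1, 3}, {6, 7}, {7, 8}, {6, 8},
    {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8},
)]
strong = strong_elements(family)
```

On this input the pairs such as {1, 2} and {2, 3} overlap each other, so only the ground set, {1, 2, 3}, {6, 7, 8} and the singletons remain.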

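Definition 3 Let $f_1, \dots, f_k$ be the children of a node $q$ of $T_{\mathcal{S}}$, the strong element tree of $\mathcal{S}$. The
node $q$ is degenerate if for every non-empty subset $J \subset [1, k]$, $\bigcup_{j \in J} f_j \in \mathcal{S}$. The node $q$ is prime if for
every proper subset $J \subset [1, k]$ with $|J| > 1$, $\bigcup_{j \in J} f_j \notin \mathcal{S}$.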

It is not difficult to see that any strong element is either prime or degenerate. Moreover, the
following theorem tells us that the tree $T_{\mathcal{S}}$ is a representation of the family $\mathcal{S}$ and that the subfamily of
strong elements $\mathcal{S}_F$ of $\mathcal{S}$ defines a "basis" of $\mathcal{S}$.

Theorem 1 [CHM81] Let $\mathcal{S}$ be a partitive family on $S$. A subset $A \subseteq S$ belongs to $\mathcal{S}$ if and only
if $A$ is strong or there exists a degenerate strong element $A'$ (a node of $T_{\mathcal{S}}$) such that $A$ is the
union of a strict subset of the children of $A'$ in $T_{\mathcal{S}}$.

As a consequence, even if a partitive family on a set $S$ can have exponentially many elements, it
always admits a representation linear in the size of $S$. Such a representation property is also known
for other families of subsets of a set, such as laminar families, cross-free families [EG97]..., as well as
for some families of bipartitions of a set, such as splits [CE80]. Recently, a similar result has been
shown for union-difference families of subsets of a set, i.e. families closed under the union and the
difference of their overlapping elements [BXH08]. In this latter case, the size of the representation
amounts to $O(|S|^2)$. For a detailed study of these aspects, the reader should refer to [BX08].



2.2 Factoring Permutations 

Although the idea of factoring permutation implicitly appeared in some early papers (see e.g. [HM91,
Hsu92, HHS95]), it has only been formalized in [CH97, Cap97]. This concept turns out to be central
to recent modular decomposition algorithms and other applications.

Let $\sigma$ be a permutation of a set $S$ of size $n$. By $\sigma(x)$, we mean the rank $i$ of $x$ in $\sigma$, and $\sigma^{-1}(i)$
stands for the $i$-th element of $\sigma$. A subset $I \subseteq S$ is a factor or an interval of a permutation $\sigma$ if
there exist $i \in [1, n]$ and $j \in [1, n]$ such that $I = \{x \mid x = \sigma^{-1}(k),\ i \leq k \leq j\}$. In other words, the
elements of $I$ occur consecutively in $\sigma$.

Definition 4 [Cap97] Let $\mathcal{S}$ be a (weakly) partitive family on a set $S$ and let $\mathcal{S}_F$ be the strong
elements of $\mathcal{S}$. A permutation $\sigma$ of $S$ is factoring for $\mathcal{S}$ if for any $F \in \mathcal{S}_F$, $F$ is a factor of $\sigma$.

For example, $\pi = 1\,2\,3\,4\,5\,6\,7\,8$, $\pi_1 = 6\,7\,8\,4\,3\,1\,2\,5$ and $\pi_2 = 8\,7\,6\,1\,3\,2\,4\,5$ are three factoring
permutations of the family $\mathcal{S}$ depicted in Figure 1. One can check that, in each of these three
permutations, the two non-trivial strong elements of $\mathcal{S}_F$, namely $\{1, 2, 3\}$ and $\{6, 7, 8\}$,
are factors.
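
The check mentioned above is easy to automate. The sketch below is only illustrative (the helper names `is_factor` and `is_factoring` are assumptions, and the strong elements are entered by hand rather than computed): a subset is a factor when the positions of its elements in the permutation are consecutive.

```python
def is_factor(subset, perm):
    """True if the elements of `subset` occupy consecutive positions in `perm`."""
    positions = sorted(perm.index(x) for x in subset)
    return positions[-1] - positions[0] + 1 == len(positions)


def is_factoring(perm, strong):
    """True if every strong element is a factor of `perm` (Definition 4)."""
    return all(is_factor(f, perm) for f in strong)


# Strong elements of the family of Figure 1, listed by hand.
strong = [set(range(1, 9)), {1, 2, 3}, {6, 7, 8}] + [{i} for i in range(1, 9)]

pi = [1, 2, 3, 4, 5, 6, 7, 8]
pi1 = [6, 7, 8, 4, 3, 1, 2, 5]
pi2 = [8, 7, 6, 1, 3, 2, 4, 5]
results = [is_factoring(p, strong) for p in (pi, pi1, pi2)]
```

A permutation such as 1 6 2 3 4 5 7 8 is rejected by the same test, since neither {1, 2, 3} nor {6, 7, 8} is consecutive in it.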

Given a layout of the strong element tree of a partitive family, a left-to-right enumeration of
the leaves results in a factoring permutation. In many cases it is easier to compute a factoring
permutation than the strong element tree. We explain in Section 6.3 how to obtain the strong
element tree from a factoring permutation.

To conclude this brief introduction to factoring permutations, we state a lemma which
formalizes the links between intervals of factoring permutations and partitive families. This lemma somehow
guided the development of factoring permutation algorithms.

Lemma 1 Let $\sigma$ be a factoring permutation of a partitive family $\mathcal{S}$. Then the set $\mathcal{I}(\mathcal{S}, \sigma)$ of
intervals of $\sigma$ which are elements of $\mathcal{S}$ is a weakly partitive family. Moreover, the strong elements
of $\mathcal{I}(\mathcal{S}, \sigma)$ and of $\mathcal{S}$ are the same.



2.3 Modules of a graph 

For the sake of presentation we only consider undirected, simple and loopless graphs. We use
the classical notation (see e.g. [BLS99]). The neighbourhood of a vertex $x$ in a graph $G = (V, E)$ is
denoted $N_G(x)$ and its non-neighbourhood $\overline{N}_G(x)$ (the subscript $G$ will be omitted when the context is
clear). The complementary graph of a graph $G$ is denoted by $\overline{G}$. Given a subset of vertices $X \subseteq V$,
$G[X]$ is the subgraph induced by $X$ (any edge of $G$ between two vertices of $X$ belongs to $G[X]$).

Let $M$ be a set of vertices of a graph $G = (V, E)$ and $x$ be a vertex of $V \setminus M$. Vertex $x$ splits
$M$ (or is a splitter of $M$) if there exist $y \in M$ and $z \in M$ such that $xy \in E$ and $xz \notin E$. If $x$ is
not a splitter of $M$, then $M$ is uniform or homogeneous with respect to $x$.

Definition 5 Let $G = (V, E)$ be a graph. A set $M \subseteq V$ of vertices is a module if $M$ is homogeneous
with respect to any $x \notin M$ (i.e. $M \subseteq N(x)$ or $M \cap N(x) = \emptyset$).

Observation 1 Let S be a subset of vertices of a graph G = (V, E). If S has a splitter x, then any 
module of G containing S also contains x. 
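
Definition 5 translates directly into a test. In the sketch below a graph is represented as a dictionary mapping each vertex to the set of its neighbours; this representation, the names `splitters` and `is_module`, and the five-vertex example graph are all assumptions made for illustration.

```python
def splitters(adj, subset):
    """Vertices outside `subset` adjacent to some, but not all, of its vertices."""
    subset = set(subset)
    return {x for x in adj
            if x not in subset and 0 < len(adj[x] & subset) < len(subset)}


def is_module(adj, subset):
    """A set is a module iff it has no splitter (Definition 5)."""
    return not splitters(adj, subset)


# Hypothetical example graph with edges 1-3, 2-3, 3-4, 4-5.
adj = {1: {3}, 2: {3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
```

Here {1, 2} is a module because both vertices see exactly vertex 3, whereas {1, 2, 3} is split by vertex 4, which is adjacent to 3 only.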

Aside from the singletons and the whole vertex set, any union of connected components (or of
co-connected components) of a graph is a simple example of a module. Let us also note that
a graph may have exponentially many modules. Indeed, any subset of the vertices of a complete graph is a
module. Nevertheless, as we shall see with the following lemma, the family of modules has strong
combinatorial properties.

Lemma 2 [CHM81] The family M of modules of a graph is partitive. 

The notions of trivial module, strong module and degenerate node are defined according to the
terminology of Section 2.1. By Lemma 2, if $M$ and $M'$ are overlapping modules, then $M \setminus M'$, $M' \setminus M$,
$M \cap M'$, $M \cup M'$ and $M \Delta M'$ are modules of $G$.

Let $M$ and $M'$ be disjoint sets. We say that $M$ and $M'$ are adjacent if every vertex of $M$ is
adjacent to all the vertices of $M'$, and non-adjacent if no vertex of $M$ is adjacent to a
vertex of $M'$.

Observation 2 Two disjoint modules are either adjacent or non-adjacent. 

A module $M$ is maximal with respect to a set $S$ of vertices if $M \subset S$ and there is no module
$M'$ such that $M \subset M' \subset S$. If the set $S$ is not specified, we shall assume $S = V$.

Definition 6 Let $\mathcal{P} = \{M_1, \dots, M_k\}$ be a partition of the vertex set of a graph $G = (V, E)$. If for
all $i$, $1 \leq i \leq k$, $M_i$ is a module of $G$, then $\mathcal{P}$ is a modular partition (or congruence partition) of
$G$.

A non-trivial modular partition $\mathcal{P} = \{M_1, \dots, M_k\}$ which only contains maximal strong modules
is a maximal modular partition. Notice that each graph has a unique maximal modular partition. If
$G$ (resp. $\overline{G}$) is not connected, then its connected (resp. co-connected) components are the elements
of the maximal modular partition. From Observation 2, we can define a quotient graph whose
vertices are the parts (or modules) belonging to the modular partition $\mathcal{P}$.

Definition 7 To a modular partition $\mathcal{P} = \{M_1, \dots, M_k\}$ of a graph $G = (V, E)$, we associate a
quotient graph $G_{/\mathcal{P}}$, whose vertices are in one-to-one correspondence with the parts of $\mathcal{P}$. Two
vertices $v_i$ and $v_j$ of $G_{/\mathcal{P}}$ are adjacent if and only if the corresponding modules $M_i$ and $M_j$ are
adjacent in $G$.
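
As a small illustration of Definition 7, the sketch below builds a quotient graph from an adjacency dictionary and a partition given as a list of sets. It assumes, without checking, that the partition is modular; under that assumption Observation 2 guarantees that testing one representative vertex per part is enough. The function name and the example graph are hypothetical.

```python
def quotient_graph(adj, partition):
    """Quotient of a graph by a modular partition.

    Part number i of `partition` becomes vertex i of the quotient. The
    partition is assumed to be modular; this is not verified here.
    """
    reps = [next(iter(part)) for part in partition]  # one representative per part
    k = len(partition)
    return {i: {j for j in range(k) if j != i and reps[j] in adj[reps[i]]}
            for i in range(k)}


# Hypothetical example graph with edges 1-3, 2-3, 3-4, 4-5.
adj = {1: {3}, 2: {3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
quotient = quotient_graph(adj, [{1, 2}, {3}, {4}, {5}])
```

In this example the module {1, 2} is contracted to a single vertex and the quotient is a path on four vertices.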

Let us remark that the quotient graph $G_{/\mathcal{P}}$ with $\mathcal{P} = \{M_1, \dots, M_k\}$ is isomorphic to any
subgraph induced by a set $V' \subseteq V$ such that $\forall i \in [1, k]$, $|M_i \cap V'| = 1$. The representative graph of
a module $M$ is the quotient graph $G[M]_{/\mathcal{P}}$ where $\mathcal{P}$ is the maximal modular partition of $G[M]$: it
is thereby the subgraph induced by a set containing a unique (representative) vertex per maximal
strong module of $G[M]$. See Figure 2. By extension, for a module $M$, we denote by $G_{/M}$ the graph
quotiented by the modular partition $\{M\} \cup \{\{x\} \mid x \notin M\}$.

Before we state the modular decomposition theorem (Theorem 2), let us present two more
properties of modular partitions and quotient graphs which are central to efficient modular decomposition
algorithms (see Section 5).

Figure 2: On the left, the grey sets are modules of the graph $G$. $\mathcal{Q} = \{\{1\}, \{2, 3\}, \{4\}, \{5\}, \{6, 7\}, \{9\}, \{8, 10, 11\}\}$
is a modular partition of $G$. The quotient graph $G_{/\mathcal{Q}}$, depicted on the right with a representative
vertex for each module of $\mathcal{Q}$, has two non-trivial modules (the sets $\{3, 4\}$ and $\{9, 10\}$).
The maximal modular partition of $G$, $\mathcal{P} = \{\{1\}, \{2, 3, 4\}, \{5\}, \{6, 7\}, \{8, 9, 10, 11\}\}$,
and its quotient graph are represented in Figure 3 (next to the root of the tree).



Lemma 3 [Möh85b] Let $\mathcal{P}$ be a modular partition of a graph $G = (V, E)$. Then $\mathcal{X} \subseteq \mathcal{P}$ is a module
of $G_{/\mathcal{P}}$ iff $\bigcup_{M \in \mathcal{X}} M$ is a module of $G$.

Lemma 3 is illustrated in Figure 2: for example, the set $\{2, 3, 4\}$ is a module of $G$; it is the union
of the modules $\{2, 3\}$ and $\{4\}$ (whose representative vertices in $G_{/\mathcal{Q}}$ are respectively 3 and 4), which
belong to the partition $\mathcal{Q}$. Lemma 3 can be strengthened to obtain a correspondence between the
strong modules of $G$ and those of $G_{/\mathcal{P}}$.

Lemma 4 Let $\mathcal{P}$ be a modular partition of a graph $G = (V, E)$. Then $\mathcal{X} \subseteq \mathcal{P}$ is a non-trivial
strong module of $G_{/\mathcal{P}}$ iff $\bigcup_{M \in \mathcal{X}} M$ is a non-trivial strong module of $G$.

The inclusion tree of the strong modules of $G$, denoted $MD(G)$, entirely represents the graph
if the representative graph of each strong module is attached to each of its nodes (see Figure 3).
Indeed, any adjacency of $G$ can be retrieved from $MD(G)$. Let $x$ and $y$ be two vertices of $G$ and
let $G_N$ be the representative graph of node $N$, their least common ancestor. Then $x$ and $y$ are
adjacent in $G$ if and only if their representative vertices in $G_N$ are adjacent.




Figure 3: The inclusion tree $MD(G)$ of the strong modules of $G$. The representative graph
associated to the root is $G_{/\mathcal{P}}$ with $\mathcal{P} = \{\{1\}, \{2, 3, 4\}, \{5\}, \{6, 7\}, \{8, 9, 10, 11\}\}$, the parts of which
correspond to the children of the root.



Let us recall that a graph is prime if it only contains trivial modules. 

Theorem 2 (Modular decomposition theorem) [Gal67, CHM81]
For any graph $G = (V, E)$, one of the following three conditions is satisfied:

1. $G$ is not connected;

2. $\overline{G}$ is not connected;

3. $G$ and $\overline{G}$ are connected and the quotient graph $G_{/\mathcal{P}}$, with $\mathcal{P}$ the maximal modular partition
of $G$, is a prime graph.

The modular decomposition theorem has two consequences. First, the quotient graphs
associated with the nodes of the inclusion tree $MD(G)$ of the strong modules are of three types:
an independent set if $G$ is not connected (the node is labelled parallel); a clique (complete graph)
if $\overline{G}$ is not connected (the node is labelled series); a prime graph otherwise. It also follows that
$MD(G)$ is unique and does not contain two consecutive series nodes nor two consecutive parallel
nodes. Parallel and series nodes of $MD(G)$ are also called degenerate nodes.

The tree $MD(G)$ is called the modular decomposition tree. Theorem 2 yields a natural
polynomial-time recursive algorithm to compute $MD(G)$: 1) compute the maximal modular partition
$\mathcal{P}$ of $G$; 2) label the root node according to the parallel, series or prime type of $G$; 3) for each
module $M$ of $\mathcal{P}$, compute $MD(G[M])$ and attach it to the root node. A subproblem central to
the computation of $MD(G)$ is to compute the maximal modular partition, a task which can be
avoided if a non-trivial module $M$ is identified. This yields another natural algorithmic scheme: by
Lemma 3 and Lemma 4, it suffices to recursively compute $MD(G[M])$ and $MD(G_{/M})$, and then
to paste $MD(G[M])$ on the leaf of $MD(G_{/M})$ corresponding to the representative vertex of $M$. As
suggested by Cowan et al. [CJS72], a naive way to compute a non-trivial module is to follow the
definition of module and Observation 1. Assume the graph $G$ contains a non-trivial module $M$.
Then $M$ contains a pair of vertices $\{x, y\}$ and, as a module is closed under adding splitters, $M$
contains every vertex reached from $\{x, y\}$ by repeatedly adding splitters. Such
an algorithm, applied to every pair of vertices, would find a non-trivial module, if any, in time $O(n^2(n + m))$. We should note that
for some generalizations of modular decomposition, no better algorithm than this "closure by
splitter" approach is known (see e.g. [BXHLdM09]).
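
The "closure by splitter" idea can be sketched as follows. This is a straightforward illustration on adjacency dictionaries, not the implementation of [CJS72], and it makes no attempt to meet the running time quoted above; the function names and the two example graphs are assumptions.

```python
from itertools import combinations


def module_closure(adj, seed):
    """Smallest module containing `seed`, built by repeatedly adding splitters."""
    module = set(seed)
    while True:
        new = {x for x in adj
               if x not in module and 0 < len(adj[x] & module) < len(module)}
        if not new:
            return module
        module |= new


def find_nontrivial_module(adj):
    """Return a module M with 2 <= |M| < n, or None if the graph is prime."""
    for x, y in combinations(adj, 2):
        module = module_closure(adj, {x, y})
        if len(module) < len(adj):
            return module
    return None


# Hypothetical examples: a graph with the module {1, 2}, and the path P4.
g = {1: {3}, 2: {3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
p4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
```

On the path P4 every pair of vertices closes up to the whole vertex set, which reflects the fact that P4 is prime.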

Before we present some structural properties of prime and totally decomposable graphs, let 
us introduce some notations and briefly discuss the composition view of the theory of modules in 
graphs. 

Notation 3 For a node $p$ of $MD(G)$, its corresponding strong module is denoted by $M(p)$ (or $P$).
In fact $M(p)$ is the union of all singletons which are leaves of the subtree of $MD(G)$ rooted at $p$.
The minimal strong module containing two vertices $x$ and $y$ is denoted by $m(x, y)$, while the maximal
strong module containing $x$ but not $y$, for any two different vertices $x, y$ of $G$, is denoted by $M(x, y)$.

The substitution operation is the reverse of the quotient operation. It consists of replacing a
vertex $x$ of $G$ by a graph $H = (V', E')$ while preserving the neighbourhood. The resulting graph is:

$G_{x \to H} = \big((V \setminus \{x\}) \cup V',\ (E \setminus \{xy \in E\}) \cup E' \cup \{yz : xy \in E \text{ and } z \in V'\}\big)$

The parallel composition or disjoint union of $k$ connected graphs $G_1, \dots, G_k$ defines a graph
whose connected components are the graphs $G_1, \dots, G_k$. This composition operation is usually
denoted $G_1 \oplus \dots \oplus G_k$.

The series composition of $k$ co-connected graphs $G_1, \dots, G_k$ defines a graph whose co-connected
components are the graphs $G_1, \dots, G_k$ (for any pair $x, y$ of vertices belonging to different graphs $G_i$
and $G_j$, the edge $xy$ has been added). The series composition is generally denoted $G_1 \otimes \dots \otimes G_k$.

These three operations are classical graph operations that have been widely used in various
contexts, among which clique-width theory [CER93].
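
The three operations are simple to express on adjacency dictionaries. The sketch below is illustrative only: the function names are assumptions, vertex sets are assumed pairwise disjoint, and the connectivity (resp. co-connectivity) requirement on the operands of a parallel (resp. series) composition is not enforced.

```python
def parallel(*graphs):
    """Disjoint union of graphs given as {vertex: set of neighbours}."""
    result = {}
    for g in graphs:
        for v, nbrs in g.items():
            result[v] = set(nbrs)
    return result


def series(*graphs):
    """Disjoint union plus every edge between two different operands."""
    result = parallel(*graphs)
    for g in graphs:
        others = set(result) - set(g)
        for v in g:
            result[v] |= others
    return result


def substitute(g, x, h):
    """Replace vertex x of g by the graph h; h inherits the neighbourhood of x."""
    nbrs_x = set(g[x])
    result = {v: set(n) - {x} for v, n in g.items() if v != x}
    for v, n in h.items():
        result[v] = set(n) | nbrs_x
    for y in nbrs_x:
        result[y] |= set(h)
    return result


def single(v):
    """Graph with a single vertex v."""
    return {v: set()}


# The path a-b-c as a series composition, then b replaced by the edge u-v.
p3 = series(single("b"), parallel(single("a"), single("c")))
g = substitute(p3, "b", {"u": {"v"}, "v": {"u"}})
```

After the substitution, {u, v} is a module of the resulting graph, which is the sense in which substitution reverses the quotient operation.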

2.4 Prime graphs



The structure of prime graphs has been extensively studied (e.g. see [ER90, ST93, CI98]). For
example, it is easy to check that the smallest prime graph is the $P_4$, the path on 4 vertices (see
Figure 4). As witnessed by the following result, $P_4$'s play an important role in the structure of
prime graphs.

Lemma 5 [CI98] Let $G$, with $|G| > 4$, be a prime graph. Then any vertex, but at most one, is
contained in an induced $P_4$. A vertex not contained in any $P_4$ is called the "nose of the bull" (see
Figure 4).

Figure 4: The vertices $a, b, c, d$ form a $P_4$ whose extremities are $a$ and $d$, and midpoints $b$ and $c$.
The graph on the right is the bull whose "nose" is vertex $x$.

The next property shows that one can always remove one or two vertices from a large enough 
prime graph to obtain a new prime graph. 

Lemma 6 [ER90, ST93] Let $G = (V, E)$ be a prime graph with at least 5 vertices. Then there
exists a subset of vertices $X$ such that $|V| - 2 \leq |X| \leq |V| - 1$ and $G[X]$ is prime.

Jamison and Olariu proposed an extension of Theorem 2 by considering the structure of prime
graphs [JO95]. A subset $C$ of vertices of a graph $G = (V, E)$ is P-connected if for any bipartition
$\{A, B\}$ of $C$, there is an induced $P_4$ intersecting both $A$ and $B$. For example, the bull is not
P-connected (consider the vertex partition $\{\{x\}, \{a, b, c, d\}\}$). A P-connected component is a maximal
P-connected set of vertices. The set of P-connected components defines a partition of the vertices.
A P-connected component $H$ is separable if there is a bipartition $(H_1, H_2)$ of $H$ such that for any
$P_4$ intersecting $H_1$ and $H_2$, the extremities are in $H_1$ and the mid-vertices in $H_2$.

Theorem 4 [JO95] Let $G = (V, E)$ be a connected graph such that $\overline{G}$ is connected. Then $G$ is either
P-connected or there exists a unique P-connected component $H$ which is separable in $(H_1, H_2)$ such
that for any vertex $x \notin H$, $H_1 \subseteq N(x)$ and $H_2 \cap N(x) = \emptyset$.

A hierarchy of graph families has been proposed based on Theorem 4, by restricting
the number of induced $P_4$'s in small subgraphs (or equivalently by restricting the structure of prime
graphs). For example, $P_4$-sparse graphs are defined as the graphs for which there is at most one $P_4$
in any induced subgraph on 5 vertices [JO92a, JO92b]. Let us also mention the $P_4$-reducible
graphs [JO95]. See [BLS99] for a complete presentation of these graph families.

2.5 Totally decomposable graphs 

A graph is totally decomposable if any induced subgraph of size at least 4 has a non-trivial module. 
As any prime graph contains a P4, it follows from Theorem [T] that any node of the modular 
decomposition tree MD(G) of a totally decomposable graph G is degenerate. 
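
Since every node is degenerate, the recursive scheme of Section 2.3 reduces, for totally decomposable graphs, to splitting along connected components or co-connected components. The sketch below illustrates this on adjacency dictionaries; it is a simple polynomial-time illustration and not one of the linear time algorithms discussed later. The nested-tuple tree encoding and the function names are assumptions.

```python
def components(adj, vertices):
    """Connected components of the subgraph induced by `vertices`."""
    remaining, comps = set(vertices), []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v] & remaining:
                remaining.discard(w)
                comp.add(w)
                stack.append(w)
        comps.append(comp)
    return comps


def cotree(adj, vertices=None):
    """Tree as nested tuples (label, [children]); leaves are vertices.

    Returns None when a subgraph and its complement are both connected,
    i.e. when a prime node would be needed.
    """
    vertices = set(adj) if vertices is None else set(vertices)
    if len(vertices) == 1:
        return next(iter(vertices))
    label, comps = "parallel", components(adj, vertices)
    if len(comps) == 1:
        co_adj = {v: vertices - adj[v] - {v} for v in vertices}
        label, comps = "series", components(co_adj, vertices)
        if len(comps) == 1:
            return None
    children = [cotree(adj, comp) for comp in comps]
    if any(child is None for child in children):
        return None
    return (label, children)


p3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
p4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
```

For the path a-b-c the root is a series node with children b and a parallel node over a and c, while for P4 the function returns None.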

The family $\mathcal{T}$ of totally decomposable graphs is natural and arose in many different contexts (see
[Sum73, CLSB81, CPS85] for references), even recently (see [BBCP04, BRV07]), as any graph of $\mathcal{T}$
can be obtained by a sequence of disjoint union and series compositions starting from single vertex graphs.
Let us remark that if $G$ is totally decomposable then so is its complement. The family of totally
decomposable graphs is also known as the cographs, for complement reducible graphs [CLSB81,
Sum73]. By definition, the cograph family is hereditary (any induced subgraph of a cograph is a
cograph). It also has a very simple forbidden subgraph characterization.

Theorem 5 [Sum73] The cographs are exactly the $P_4$-free graphs.




Figure 5: A cograph and its modular decomposition tree (also called cotree). 

The following lemma states classical properties of cographs whose proofs (left to the reader) are 
good exercises to understand the structure of cographs. 

Lemma 7 Let $x$, $y$ and $v$ be vertices of a cograph $G = (V, E)$.

1. If $xv \in E$, $yv \notin E$ and $xy \in E$, then $m(v, y) \subseteq M(v, x)$.

2. If $xv \in E$, $yv \in E$ and $xy \notin E$, then $M(v, x) = M(v, y)$ and $m(v, x) = m(v, y)$.

Using Theorem 5, one can propose a naive cograph recognition algorithm by searching for an
induced $P_4$. But so far, most of the linear time cograph recognition algorithms construct the
modular decomposition tree and exhibit a $P_4$ in case of failure.
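
The naive approach can be written in a few lines. The sketch below enumerates all ordered 4-tuples of vertices, so it runs in time $O(n^4)$ and is meant only as an illustration of Theorem 5; the function names and the adjacency dictionary representation are assumptions.

```python
from itertools import permutations


def find_induced_p4(adj):
    """Return (a, b, c, d) inducing the path a-b-c-d, or None if there is none."""
    for a, b, c, d in permutations(adj, 4):
        if (b in adj[a] and c in adj[b] and d in adj[c]
                and c not in adj[a] and d not in adj[a] and d not in adj[b]):
            return (a, b, c, d)
    return None


def is_cograph(adj):
    """A graph is a cograph iff it has no induced P4 (Theorem 5)."""
    return find_induced_p4(adj) is None


p4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
c4 = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
```

The cycle on four vertices is a cograph (it is the series composition of two independent pairs), whereas the cycle on five vertices contains an induced P4.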

The first linear time cograph recognition algorithm was proposed in 1985 by Corneil, Perl and
Stewart [CPS85]. It incrementally constructs the modular decomposition tree, also called cotree
when restricted to cographs, as long as the graph induced by the processed vertices is a cograph.
Even if alternative recognition algorithms have recently been proposed [Dah95, HP05, BCHP03],
the seminal algorithm of [CPS85] is a cornerstone of the algorithmics of modular decomposition
and turns out to have a large impact even on other decomposition techniques (e.g. on the split
decomposition [GP07]). We present Corneil et al.'s algorithm in Section 3.

2.6 Bibliographic notes 

The seminal paper on modular decomposition of graphs is probably Gallai's paper [Gal67] on
transitive orientation. To our knowledge, the only survey paper is due to Möhring and
Radermacher [MR84]. More recently, Ehrenfeucht, Harju and Rozenberg [EHR99] published a book on
the decomposition of 2-structures (a generalization of graphs) which presents modular
decomposition in a more general framework. In his PhD thesis [BX08], Bui Xuan proposes a survey
as well as original results on the representation of set families. Many graph families are
well-structured with respect to modular decomposition, e.g. comparability graphs, permutation
graphs, cographs... For these aspects, the reader should refer to the books of Golumbic [Gol80] and
more recently [BLS99, Spi03]. The algorithmic aspects are particularly developed in [Gol80, Spi03].

We saw that the family of modules in a graph is partitive. If we move to directed graphs,
then we obtain a weakly partitive family. The related decomposition of bipartite graphs into
bi-modules also yields a weakly partitive family [FHdMV04]. In order to formalize split decomposition
[CE80], bipartitive families have been introduced [CE80, Cun82]. For a recent survey on all kinds
of variations on modular decomposition, the reader should refer to [BX08].

3 Cographs recognition algorithms as an appetizer 

We first study in detail the algorithm of Corneil, Perl and Stewart [CPS85]. If the input graph is
a cograph, this vertex-incremental algorithm builds the cotree by adding the vertices one by one
in an arbitrary order. Then, we sketch how the cotree of a cograph can be updated under edge
modification, a result due to Shamir and Sharan [SS04].

3.1 Adding a vertex to a cograph 

Consider the following subproblem: given a cograph $G = (V, E)$ together with its cotree $MD(G)$,
a vertex $x$ and a subset of vertices $S \subseteq V$, test whether the graph
$G + (x, S) = (V \cup \{x\}, E \cup \{xy \mid y \in S\})$ is a cograph and, if so, output the cotree $MD(G + x)$.
Corneil et al. [CPS85] showed that
whether $G + x$ is a cograph or not can be characterized by a labelling of the nodes of the cotree
$MD(G)$. A node $p$ receives the label: empty, if the corresponding module $M(p)$ does not intersect
$S$; adjacent, if $M(p) \subseteq S$; and mixed otherwise. Remark that by definition any child of a node
labelled adjacent (resp. empty) is also labelled adjacent (resp. empty).

Lemma 8 [CPS85] Let $G$ be a cograph, $x$ a vertex not in $V$ and $S \subseteq V$. The graph $G + (x, S)$ is a
cograph iff

1. either none of the nodes of the cotree $MD(G)$ is mixed;

2. or the set of mixed nodes induces a path $\pi$ from the root of $MD(G)$ to some node $p$ and

(a) the children, not on $\pi$, of the series nodes of $\pi$ different from $p$ are all adjacent;

(b) the children, not on $\pi$, of the parallel nodes of $\pi$ different from $p$ are all empty.

The main idea expressed by the conditions of Lemma 8 is that the modifications of the cotree
implied by the insertion of vertex $x$ are localized in the subtree of $MD(G)$ rooted at node $p$. Indeed,
any module disjoint from $M(p)$ is not affected by $x$'s insertion (the corresponding nodes are labelled
empty or adjacent). In a sense, node $p$ should be considered as the insertion node. The cotree
updates only depend on node $p$ (e.g. whether it is mixed or adjacent). An example is depicted in
Figure 6.

The algorithm first labels the cotree in a bottom-up manner. The leaves corresponding to 
vertices of S are labelled adjacent. A node labelled adjacent forwards a partial mark to its father. 

Figure 6: Insertion of the vertex $x$ adjacent to $S = \{b, d, g, h\}$. Grey nodes are the nodes labelled
adjacent and dashed nodes are the mixed nodes. The insertion node $p$ is the bold series node (father
of $a, b, c, d$).



When a node has received a mark from each of its children, it is labelled adjacent. At the end of this
process the empty nodes have never been visited, while the partially marked nodes correspond,
if $G + x$ is a cograph, to the parallel nodes of the path $\pi$ from the insertion node to the root of
$MD(G)$. It is not difficult to see that the number of marked nodes is linear in the size of $S$,
meaning that the labelling process runs in time $O(|S|)$. Testing the condition of the above lemma
can be done within the same complexity as well.
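
The condition of Lemma 8 can be illustrated by a simple recursive labelling of the whole cotree. Note that this sketch visits every node and therefore does not achieve the $O(|S|)$ bound of the marking process described above; it also only decides whether the insertion is possible and does not update the tree. The nested-tuple encoding (label, [children]) and the function name are assumptions.

```python
def insertion_keeps_cograph(tree, S):
    """Test the condition of Lemma 8 for a new vertex with neighbourhood S.

    `tree` is a cotree encoded as nested tuples (label, [children]) with
    label "series" or "parallel"; leaves are vertex names.
    """
    def label(node):
        # Returns "adjacent", "empty" or "mixed"; raises ValueError on failure.
        if not isinstance(node, tuple):
            return "adjacent" if node in S else "empty"
        kind, children = node
        labels = [label(child) for child in children]
        if all(l == "adjacent" for l in labels):
            return "adjacent"
        if all(l == "empty" for l in labels):
            return "empty"
        mixed = labels.count("mixed")
        if mixed > 1:
            raise ValueError("the mixed nodes do not form a path")
        if mixed == 1:
            # This node lies on the path above p: check conditions (a)/(b).
            required = "adjacent" if kind == "series" else "empty"
            if any(l not in ("mixed", required) for l in labels):
                raise ValueError("a child off the path has the wrong label")
        return "mixed"

    try:
        label(tree)
        return True
    except ValueError:
        return False


# Cotree of the path a-b-c: b is adjacent to both a and c.
p3_tree = ("series", ["b", ("parallel", ["a", "c"])])
```

For instance, attaching a new vertex to a alone would create the path x-a-b-c, and the test fails because the leaf b, a child of the series root off the mixed path, is labelled empty.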

Theorem 6 [CPS85] The family of cographs can be recognized in linear time.

3.2 Edge modification algorithms for cographs

Let us now turn to the edge modification problem which consists in updating the cotree of a cograph 
G under an edge insertion or deletion. Since the cotree of a cograph can be obtained from the cotree 
of its complement by flipping the parallel and the series nodes, deleting or inserting an edge in a 
cograph are equivalent problems. 
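
The flipping argument can be made concrete with a small sketch. It uses a nested-tuple encoding (label, [children]) of cotrees, which is an assumption of this example, and relies on the fact that two vertices of a cograph are adjacent exactly when their least common ancestor in the cotree is a series node.

```python
def complement(tree):
    """Cotree of the complement graph: swap the series and parallel labels."""
    if not isinstance(tree, tuple):
        return tree
    kind, children = tree
    flipped = "parallel" if kind == "series" else "series"
    return (flipped, [complement(child) for child in children])


def leaves(tree):
    """Set of vertices appearing as leaves of `tree`."""
    if not isinstance(tree, tuple):
        return {tree}
    return set().union(*(leaves(child) for child in tree[1]))


def adjacent(tree, x, y):
    """For distinct leaves x, y: True iff their least common ancestor is series."""
    kind, children = tree
    for child in children:
        below = leaves(child)
        if x in below and y in below:
            return adjacent(child, x, y)
    return kind == "series"


# Cotree of the path a-b-c.
p3_tree = ("series", ["b", ("parallel", ["a", "c"])])
co_tree = complement(p3_tree)
```

In the flipped tree, a and c become adjacent while b becomes isolated, which is indeed the complement of the path a-b-c.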

Lemma 9 [SS04] Let $x$ and $y$ be two non-adjacent vertices of a cograph $G = (V, E)$. Then
$G + xy = (V, E \cup \{xy\})$ is a cograph iff $x$ is a child of $m(x, y)$ and $M(y, x) \subseteq N(x)$.

Let us sketch the proof. As $xy \notin E$, the module $m(x, y)$ is represented by a parallel
node. Assume the conditions of Lemma 9 do not hold. Then the path in the cotree from $m(x, y)$
to $x$ (resp. $y$) contains a series node $p_x$ (resp. $p_y$) which is the least common ancestor of $x$ (resp.
$y$) and some leaf $u_x$ (resp. $u_y$). Then the vertices $\{u_x, x, y, u_y\}$ induce a $P_4$ in the graph $G + xy$.

It follows from Lemma 9 that as long as the modified graph remains a cograph, the modifications
in the cotree are local and can be done in constant time. From the results presented in this section, we
obtain that:

Theorem 7 [SS04, CPS85] There exists an algorithm maintaining the modular decomposition tree
of a cograph which runs in time $O(d)$ per modification (edge or vertex insertion and deletion), where
$d$ is the number of edges involved in the modification.

Such an algorithm is known in the literature as a fully-dynamic algorithm.

Figure 7: Update of the cotree to insert the edge $xy$ in a cograph. The node $m(x, y)$ is split into two
parallel nodes, say $p$ and $q$, one being the father of $x$, the other the father of the other children
of $m(x, y)$. Then leaf $y$ is extracted from the cotree and attached to a new series node inserted
between nodes $p$ and $q$.

3.3 Bibliographic notes 

In the late 80's, Müller and Spinrad generalized Corneil et al.'s algorithm to the first quadratic
modular decomposition algorithm of graphs [MS89]. Their algorithm is also incremental, but unlike
in Corneil et al.'s algorithm, the whole graph has to be known at the beginning of the algorithm.
This restriction is required for the sake of adjacency tests.

Concerning the cograph recognition problem, new algorithms also appeared recently. Habib
and Paul [HP05] proposed a partition refinement based algorithm (see Section 4) and Bretscher et
al. [BCHP08] discovered a simple Lexicographic Breadth First Search [RTL76] based algorithm.

Aside from the two cograph algorithmic results presented above, fully-dynamic algorithms have
recently been proposed to maintain a representation based on the modular decomposition tree
under vertex and edge modifications for various graph classes: permutation graphs [CP06], interval
graphs [Cre09, Iba09], etc. The fully-dynamic representation problem has also been solved for other
families of graphs, e.g. proper interval graphs [HSS01], using other decomposition schemes.

Besides, Corneil et al.'s algorithm has been generalized to the split decomposition [CE80] to
obtain an optimal fully-dynamic algorithm for the distance hereditary graph recognition
problem [GP07]. More recently, by the same technique, Gioan et al. derived an almost linear time
split decomposition algorithm [GPTC09a] and the first subquadratic circle graph recognition
algorithm [GPTC09b].

4 Partition refinement 

Partition refinement, as an algorithmic technique, has been used in a number of problems, the first
of which is probably deterministic automata minimization [Hop71]. Paige and Tarjan [PT87]
wrote a synthesis paper on this technique. Since then, the number of problems solved by partition
refinement keeps increasing: interval graph recognition [HPV99] and completion [RST08], transitive
orientation, and the consecutive ones property for boolean matrices [HMPV00] are examples among others.
As we will see, this technique turns out to be a powerful and simple algorithmic paradigm that
plays an important role in the context of modular decomposition.

We first present the data-structure and the elementary operation, namely the refine operation,
of the partition refinement technique. Then, we illustrate this technique with an algorithm that
computes a modular partition of a graph. Let us mention that this algorithm closely follows the
lines of Hopcroft's deterministic automaton minimization algorithm [Hop71].

4.1 Data-structures and algorithmic scheme 

Let P and P' be two partitions of the same set V. The partition P is smaller than P', denoted
P < P', if P ≠ P' and any part of P is a subset of some part of P'. The partition P is stable with
respect to a set S if none of the parts of P overlaps S.

Partition refinement consists of repeating, as long as needed, the operation described in
Algorithm 1. The initial partition and the sequence of pivot sets used in the successive refinement
steps have a large impact on the overall complexity of the algorithm. Partitioning the vertex set
of a graph with respect to the neighbourhood of some vertex is a common operation in graph
algorithms. Indeed in our examples, all pivot sets considered correspond to the neighbourhood of
some vertex.



Algorithm 1: Refine(P, S)
Input: A partition P of a set V and a subset S ⊆ V, called pivot set
Output: The coarsest partition refining P and stable for S
begin
    foreach part X ∈ P do
        if X ∩ S ≠ ∅ and X ∩ S ≠ X then replace X by X ∩ S and X \ S;
end



Let us briefly describe a very useful data-structure, namely the standard partition data structure
(see Figure 8). The elements of the set V to be partitioned are stored in a doubly linked list. Each
element of V is assigned a pointer towards the part it belongs to. The elements of a part X remain
consecutive in the doubly linked list (they form an interval), so that each part can maintain a pointer
towards its first and its last element in the list.

Figure 8: P' = Refine(P, S).



Notation 8 The data-structure implicitly represents an ordered partition: the parts are totally
ordered. Depending on the application, this aspect may or may not be important. In order to
distinguish the two different cases, an ordered partition will be denoted by P = [X1, ..., Xk] while a
non-ordered partition will be denoted by P = {X1, ..., Xk}.

Given a subset S ⊆ V, using this standard partition data structure, one can build a list L
containing the parts of P intersecting S, such that in each of these parts the elements of S occur
first. Then using L, one can split every such part X into X ∩ S and X \ S. A careful complexity analysis
shows the following result:

Lemma 10 The time complexity of the operation Refine(P, S) is O(|S|).
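
The refine operation can be illustrated by the following minimal Python sketch. It is only an illustration under simplifying assumptions: parts are Python sets held in a list (the function name `refine` and this representation are ours, not the doubly linked list described above), and the element-to-part index is rebuilt at each call, so the sketch does not attain the O(|S|) bound of Lemma 10.

```python
def refine(partition, pivot):
    """Return the ordered partition obtained by splitting each part X that
    overlaps `pivot` into X & pivot (placed first) and X - pivot."""
    # Index of the part containing each element (rebuilt here for simplicity).
    part_of = {x: i for i, part in enumerate(partition) for x in part}
    # For every part touched by the pivot, collect its elements lying in the pivot.
    hit = {}
    for s in pivot:
        if s in part_of:
            hit.setdefault(part_of[s], set()).add(s)
    result = []
    for i, part in enumerate(partition):
        inside = hit.get(i, set())
        if inside and len(inside) < len(part):
            # The part overlaps the pivot: split it.
            result.append(inside)
            result.append(part - inside)
        else:
            # The part is disjoint from the pivot or contained in it.
            result.append(set(part))
    return result
```

Refining by a pivot set or by its complement yields the same family of parts; only the relative order of the two halves of each split part differs.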

We conclude this brief introduction by a few remarks. Refining a partition by a subset S or
by its complement V \ S are equivalent operations: Refine(P, S) = Refine(P, V \ S). It
is thereby possible to deal with the complement of the input graph without explicitly storing its
edge set. Partition refinement is usually used either to compute a total ordering of the vertices
(e.g. LexBFS) or the equivalence classes of some equivalence relation (e.g. maximal sets of twin
vertices). McConnell and Spinrad [MS00] showed how to augment the data-structure in order
to extract within the same complexity, at each refinement step, the edges incident to vertices
belonging to different parts. This operation is useful to efficiently compute the quotient graph
associated to a modular partition. For a more detailed presentation of partition refinement refer
to [PT87, HPV98, HPV99, HMPV00].

Of course many variations of the standard partition data structure have been introduced, as
for example changing the doubly linked list into an array of size |V|. A further requirement can
be that the elements of every part X of P are maintained sorted according to a given initial
ordering τ of V. This can be done within the same complexity and is very useful for example when
dealing with LexBFS multi-sweep algorithms. The ordering given by some previous LexBFS can
be used as a tie-break rule for another LexBFS [Cor04b, Cor04a, BCHP08].

4.2 Hopcroft's rule and computation of a modular partition 

Partition refinement is the right tool to compute a modular partition, an important subproblem
towards efficient modular decomposition algorithms. In this section, we focus on the problem
of computing the coarsest modular partition (see Definition 8) of a given vertex partition. The
algorithm we present runs in time O(n + m log n) and is based on Hopcroft's rule, which is used
in various simple quasi-linear time modular decomposition algorithms.

Definition 8 Let P be a partition of the vertices of a graph G = (V, E). The coarsest modular
partition of G with respect to P is the largest modular partition Q such that Q ≤ P.

The main idea of the algorithm is the following: as long as there is a part X which is not uniform
for some vertex x ∉ X, the current partition P is refined with the neighbourhood N(x). When the
algorithm ends, all the parts are modules. Finding, at each step, a vertex x whose neighbourhood
strictly refines the partition P is the usual barrier to linear time complexity. However, using the
so-called Hopcroft's rule, one gets a fairly simple solution that uses the neighbourhood of each vertex
at most log n times.
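
Before turning to Hopcroft's rule, the main idea can be sketched naively in Python. This is a fixed-point illustration only, not Algorithm 2: every vertex is used as a pivot in each round, so the running time is polynomial but far from O(n + m log n). The function name and the representation of the graph as a dictionary `adj` mapping each vertex to its set of neighbours are assumptions made for this example.

```python
def coarsest_modular_partition(adj, partition):
    """Naive fixed point: split every part that some outside vertex x is not
    uniform on, using N(x), until all parts are modules."""
    parts = [set(p) for p in partition]
    changed = True
    while changed:
        changed = False
        for x in adj:
            refined = []
            for part in parts:
                inside = part & adj[x]
                # Only parts not containing x may be split by N(x).
                if x not in part and inside and inside != part:
                    refined += [inside, part - inside]
                    changed = True
                else:
                    refined.append(part)
            parts = refined
    return parts
```

A part is split only by the neighbourhood of a vertex outside it, so a module contained in a part of the input partition is never cut, and when no split applies every remaining part is a module.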

Lemma 11 Let P be a partition of the vertices of a graph G = (V, E) and x be a vertex of some
part X. If P is stable with respect to N(y), ∀y ∉ X, then X is a module of G and the partition
Q = Refine(P, N(x)) is stable with respect to N(x'), ∀x' ∈ X.

The above lemma (which is a direct consequence of the definition of module) shows that using
as pivots the vertices of all the parts of P but one, say Z, plus one vertex z of Z is enough. For
complexity issues, the avoided part Z has to be chosen as the largest part of P. Similarly, once
a part X has been split, the process continues recursively on the subgraph induced by X and the
resulting largest subpart can be avoided (meaning that only one of its vertices has to be used as
pivot). This "avoid the largest part" technique is known as Hopcroft's rule and was first
proposed in the deterministic automata minimization algorithm [Hop71].



Algorithm 2: Modular Partition
Input: A partition P of the vertex set V of a graph G
Output: The coarsest modular partition Q smaller than P
begin
    Let Z be the largest part of P;
    Q ← P; K ← {Z}; L ← {X | X ≠ Z, X ∈ P};
    while L ∪ K ≠ ∅ do
        if there exists X ∈ L then S ← X and L ← L \ {X};
        else
            Let X be the first part of K and x arbitrarily selected in X;
            S ← {x} and K ← K \ {X};
        foreach vertex x ∈ S do
            foreach part Y ≠ X such that N(x) ⊥ Y do
                Replace in Q, Y by Y1 = Y ∩ N(x) and Y2 = Y \ N(x);
                Let Ymin (resp. Ymax) be the smallest part (resp. largest) among Y1 and Y2;
                if Y ∈ L then L ← L ∪ {Ymin, Ymax} \ {Y};
                else
                    L ← L ∪ {Ymin};
                    if Y ∈ K then Replace Y by Ymax in K;
                    else Add Ymax at the end of K;
end



To implement this rule, the parts are stored in two disjoint lists K and L. The neighbourhoods 
of all the vertices of parts belonging to L will be used to refine the partition. For the parts belonging 
to K, only the neighbourhood of one arbitrarily selected vertex is used. Since K is managed with 
a FIFO priority rule, this guarantees that the first part of the list, when extracted, is a module. 

Theorem 9 Let P be a partition of the vertices of a graph G = (V, E). Algorithm 2 computes the
coarsest modular partition for G and P in time O(n + m log n).

The correctness of the algorithm follows from the next three invariant properties. The first
invariant shows that a module contained in some part of the given partition cannot be split, while
the third one guarantees that the algorithm outputs a modular partition.

1. If M is a module of G contained in a part X ∈ P, then there exists a part Y of the current
partition containing M.

2. If L = ∅, then the first part Y of K is a module.

3. If the current partition contains a part X that is not a module, then there exists Y ∈ L ∪ K
different from X and containing a splitter y for X.

Complexity issues: The main while loop of Algorithm 2 manages a set S of vertices whose neighbourhoods
have to be used to refine the current partition. The set S is computed from the lists L and K.
Since the current part containing a given vertex x can be added to L only if its size is smaller than
half of the size of the former part containing x, the neighbourhood of each vertex x is guaranteed
to be visited at most log(|V|) times by the algorithm. Furthermore, when a vertex x of a part X
extracted from K is used, neither x nor any other vertex of X is used again. As each refinement by N(x)
costs O(|N(x)|), this yields an O(n + Σ_{x∈V} |N(x)| log n) = O(n + m log n) complexity, as claimed.

4.3 Bibliographic notes 

As already mentioned, the use of the partition refinement technique dates back to 1971 for the
deterministic automata minimization problem [Hop71]. In 1987, Paige and Tarjan used this technique again
to solve three different problems: functional partition, coarsest relational partition and
doubly lexicographic ordering of a boolean matrix. In the late 90's, it has been used more
systematically in the context of modular decomposition and transitive orientation, yielding O(n + m log n)
practical and simple algorithms (see e.g. [MS00, HMPV00]).

5 Recursive computation of the modular decomposition tree 

In 1994, Ehrenfeucht, Gabow, McConnell and Sullivan [EGMS94] proposed a quadratic algorithm
for the modular decomposition.¹ The principle of this algorithm, which we will call the skeleton
algorithm, is the basis of a large number of the known subquadratic algorithms proposed in the
late 90's (see e.g. [MS00, DGM01]), which could abusively be considered as a series of different
implementations of the skeleton algorithm. The complexities of these implementations are
respectively O(n + m·α(n, m)) or O(n + m) [DGM01], and finally O(n + m log n) [MS00]. We describe
the principle of the skeleton algorithm without considering the complexity issues. We then discuss
the differences in the time complexity of the known algorithms.

5.1 The skeleton algorithm 

Let us first mention that the skeleton algorithm computes a non-reduced form of the modular
decomposition tree MD(G): the resulting tree may contain some series (or parallel) node that is a child of
a series (or parallel) node. All the algorithms we describe in this section will do so. It does not
impact the complexity issues as a single search of the tree is enough to reduce it in time O(n). In
the following, we will abusively denote by MD(G) the (non-reduced) decomposition tree returned by
these algorithms.

The main idea developed by Ehrenfeucht et al. [EGMS94] is to first compute a "spine" of the
modular decomposition tree MD(G), then to recursively compute the modular decomposition trees
of some induced subgraphs, which are eventually attached to the spine. More formally:

¹ This algorithm is designed for 2-structures, a classical generalization of graphs.

Definition 9 Let v be an arbitrary vertex of a graph G = (V, E). The v-modular partition is the
following modular partition:

M(G, v) = {v} ∪ {M | M is a maximal module not containing v}

We define spine(G, v) as the modular decomposition tree MD(G/M(G,v)).

First we notice that M(G, v) is easy to compute.

Lemma 12 The partition M(G, v) is the coarsest modular partition for G and P = {N(v), {v}, N̄(v)},
where N̄(v) = V \ (N(v) ∪ {v}) denotes the set of non-neighbours of v, and can be computed in
time O(n + m log n).




Figure 9: On the left, a modular decomposition tree MD(G) and on the right, the modular
partition M(G, v) with the corresponding spine between v and the root of MD(G).



Algorithm 3: Ehrenfeucht et al. [EGMS94]
Input: An arbitrary vertex v of G = (V, E), T = spine(G, v) and {T_X = MD(G[X]) | X ∈ M(G, v)}
Output: The modular decomposition tree MD(G)
begin
    foreach leaf X of T do
        Let T_X = MD(G[X]) and p(X) be X's father in T;
        Replace X by T_X in T;
1       if the root r(T_X) and p(X) are both parallel or series then
            Remove r(T_X) and connect the children of r(T_X) to p(X);
end



Let us notice that any degenerate strong module (series or parallel) containing v will be
represented in spine(G, v) by a binary node. The purpose of the test at Line 1 of Algorithm 3 is to correctly
fix those binary nodes. The correctness of Algorithm 3 is a consequence of the following
properties:

Lemma 13 [EGMS94] Let v be a vertex of a graph G = (V, E) and M(G, v) be the associated
modular partition. Then:

1. Any non-trivial module of G/M(G,v) contains v;

2. A set X ⊆ M(G, v) is a non-trivial strong module of G/M(G,v) iff ∪_{M∈X} M is an ancestor of
v in MD(G);

3. Any module not containing v is a subset of a part M ∈ M(G, v).



Computing spine(G, v) is the difficult and technical task of the skeleton algorithm; indeed it is
its main complexity bottleneck. The solution we present hereafter has been proposed in [EGMS94]
and yields quadratic running time. Later on, Dahlhaus et al. [DGM01] improved this step and
obtained a subquadratic running time (see the discussion of Section 5.3).



5.2 Computation of spine(G, v). 

Definition 10 A graph G = (V, E) is nested if there exists a vertex v ∈ V which is contained in
all the non-trivial modules of G. Such a vertex is called an inner vertex of G.



As a direct consequence of Lemma 13, the quotient graph G/M(G,v) is a nested graph with inner
vertex v.



In order to compute the modules of G/M(G,v) and spine(G, v), Ehrenfeucht et al. [EGMS94]
introduced an auxiliary forcing digraph, the arc set of which guarantees the existence of a directed
path from any vertex u to any vertex w ∈ m(u, v), the smallest module containing u and v. As v
belongs to all the modules of G/M(G,v), a simple search on the forcing graph will suffice to compute
spine(G, v).



Definition 11 Let v be an arbitrary vertex of a graph G = (V, E). The forcing graph F(G, v) is
a directed graph whose vertex set is V \ {v}. The arc xy exists if y is a splitter for {x, v}.

In other words, if xy exists then y belongs to any module containing v and x.




Figure 10: A nested graph G = (V, E) together with its modular decomposition tree MD(G)
and on its right the forcing graph F(G, v). The strongly connected components of F(G, v) are
{1}, {2, 3, 4}, {5}. Any module of G containing 3 and v also contains {1, 2, 4}, the vertices that can
be reached from vertex 3 in F(G, v).



Lemma 14 [EGMS94] If X is the set of vertices that can be reached from vertex x in the forcing
graph F(G, v), then {v} ∪ X = m(v, x).
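
Definition 11 and Lemma 14 can be illustrated by the following quadratic Python sketch. It is a hedged illustration rather than the implementation of [EGMS94]: the function names and the dictionary `adj`, mapping each vertex to its set of neighbours, are assumptions of this example, and the vertex x is counted among the vertices reachable from x.

```python
def forcing_graph(adj, v):
    """Arcs x -> y for every y that is a splitter of {x, v}."""
    others = [x for x in adj if x != v]
    arcs = {x: set() for x in others}
    for x in others:
        for y in others:
            # y splits {x, v} when it is adjacent to exactly one of x and v.
            if y != x and ((y in adj[x]) != (y in adj[v])):
                arcs[x].add(y)
    return arcs


def smallest_module_containing(adj, v, x):
    """m(v, x): v together with the vertices reachable from x in F(G, v)."""
    arcs = forcing_graph(adj, v)
    seen, stack = {x}, [x]
    while stack:
        u = stack.pop()
        for w in arcs[u] - seen:
            seen.add(w)
            stack.append(w)
    return seen | {v}
```

On an induced path on four vertices, which is prime, the search from any vertex reaches the whole vertex set, whereas for two vertices forming a module no arc leaves the start vertex.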

In the following we will only consider the graph G/M(G,v) and its forcing graph F(G/M(G,v), v).
Applying Lemma 14 to F(G/M(G,v), v), we obtain the following property.



² The definition proposed here slightly differs from the original one of [EGMS94]. This modification simplifies the
relationships with the results of [DGM01].

Corollary 1 [EGMS94] Let M_x be the module of M(G, v) containing the vertex x. If X is the set
of modules that can be reached from M_x in F(G/M(G,v), v), then ∪_{M∈X} M = m(v, x).

We now consider the block graph B(G, v) of F(G/M(G,v), v) (see [CLR90]) whose vertices are
the strongly connected components of F(G/M(G,v), v), also called the blocks of (G, v). An arc of
B(G, v) between the blocks B and B' exists if the vertices of B' can be reached in F(G/M(G,v), v)
from the vertices of B.

Lemma 15 [EGMS94] The transitive reduction of the block graph B(G, v) is a chain.



A set of vertices of a digraph is a sink if it has no out-neighbour. By Lemma 15, any sink set of
F(M(G, v)) is the union of consecutive blocks containing the last one in the transitive reduction
of B(G, v). Each sink set corresponds to a module of G/M(G,v).

Corollary 2 [EGMS94] Let v be a vertex of a graph G = (V, E). A set M of vertices containing v
is a module of G/M(G,v) iff M is the union of {v} and the modules of M(G, v) belonging to a sink
set X of B(G, v).

Thereby the forcing graph F(G/M(G,v), v) describes the modules of G/M(G,v) and the block
graph B(G, v) allows us to compute spine(G, v). Finally, MD(G) is obtained recursively by
following the lines of Lemma 13.



5.3 Complexity issues 

Rather than detailing the complexity analysis, we point out the differences between the original
skeleton algorithm presented in [EGMS94] and its later versions improved in [DGM01]. The
interested reader should consult the original papers for details. As already mentioned, a quadratic time
complexity analysis is proposed in [EGMS94]. The main bottlenecks are the computation of the
partition M(G, v) and the construction of MD(G/M(G,v)).

Two new versions of the skeleton algorithm proposed by Dahlhaus, Gustedt and McConnell [DGM01]
respectively run in O(n + m·α(n, m)) time and in linear time. To improve the time complexity,
the authors of [DGM01] borrowed from [Dah95] the idea to first recursively compute the modular
decomposition trees of the subgraphs induced by N(v) and by N̄(v). It follows from the next
lemma that M(G, v) is easy to retrieve from those trees.

Lemma 16 If X is a module of M(G, v), then X is either a module of G[N(v)] or a module of
G[N̄(v)].

As in [EGMS94], the technique used to compute spine(G, v) relies on a forcing digraph. Recall
that the vertices of F(M(G, v)) are the modules of G (indeed the modules of M(G, v)), which turns
out to be too strong a condition for time complexity issues. In [DGM01], the forcing digraph is
rather defined with the help of an equivalence relation. The idea is that each equivalence class
gathers vertices of N(v) or of N̄(v) which appear in a set of sibling modules of some ancestor
node of v in MD(G) (or spine(G, v)). The partition defined by the equivalence classes is a coarser
partition than M(G, v).

The final trick is that given MD(G[N(v)]) and MD(G[N̄(v)]), the computation of M(G, v),
spine(G, v) and finally MD(G) has to be done in time linear in the number of active edges, i.e. the
edges incident to v and the edges linking vertices of N(v) and N̄(v). The α(n, m) factor in the
first version of the skeleton algorithm presented in [DGM01] is due to the use of some union-find
data-structures required to update the current tree. A clever time complexity analysis yields linear
time if a careful pre-processing step is used to fix the recursion tree.



5.4 Bibliographic notes 

Let us mention that the problem of finding a simple linear time algorithm for the modular
decomposition is presented in [MS00] or [Spi03] as an open problem. In his book [Spi03], Spinrad wrote
(p. 149):

"I hope and believe that in a number of years the linear algorithm can be simplified as
well"

Based on partition refinement techniques, a simplified O(n + m log n) version of the skeleton
algorithm has been developed in [MS00].



6 Factoring permutation algorithm 

In his PhD thesis, Capelle [Cap97] proved that computing the modular decomposition tree of a
graph and computing a factoring permutation (see Definition 4 and Figures 3 and 12) are two equivalent
tasks, as each can be retrieved from the other in linear time [CHdM02]. It follows that computing
the modular decomposition of a graph can be divided into two different steps: 1) computation of
a factoring permutation; 2) computation of the modular decomposition tree given the factoring
permutation. The main interest of such a strategy is to obtain an algorithm that avoids the auxiliary
data-structures needed to compute union-find and least common ancestor operations, as used
in [DGM01] for example. Moreover, in some recent applications (e.g. comparative genomics [UY00,
BHS02, HMS09]), the given data is not the graph nor the partitive family but rather a factoring
permutation. This concept turns out to be of interest by itself.

As noticed by Capelle [Cap97], this strategy was already used in a few cases such as the
computation of the modular decomposition tree of chordal graphs [HM91] and the block tree of inheritance
graphs [HHS95]. In [HPV98, HPV99], a partition refinement algorithm is proposed to compute a
factoring permutation of a graph in time O(n + m log n). Restricted to cographs, the complexity
can be improved down to linear time [HP05].

We will first revisit the partition refinement algorithm of [HPV98] and show how it can be adapted to compute a
factoring permutation in time O(n + m log n). This algorithm has to be compared to McConnell
and Spinrad's implementation [MS00] of Ehrenfeucht et al.'s algorithm. The main differences are
that the modular decomposition tree is never built and the relative order between the different
parts of the partition is important.

There exist several linear time algorithms that, given a factoring permutation of a graph, compute
its modular decomposition tree. A recent one is proposed in [BCdMR05, BCdMR08]. We describe
the principle of the first one, due to Capelle, Habib and de Montgolfier [CHdM02].



6.1 Computing a factoring permutation 

An ordered partition P = [X1, ..., Xk] of a set ℰ defines a partial order on ℰ, the maximal antichains
of which are exactly the parts of P. In other words, we have x_i <_P x_j iff x_i ∈ X_i, x_j ∈ X_j and
i < j. Thereby refining an ordered partition could be understood as computing an extension of the
corresponding partial order.

We will abusively write x <_P M, for x ∈ ℰ and M ⊆ ℰ, if x <_P y for all y ∈ M. To prove
the correctness of the algorithm, we need to generalize the definition of interval of permutations to
ordered partitions.

Definition 12 Let P be an ordered partition of a set ℰ. A subset S ⊆ ℰ is an interval of P iff
there are two parts L ∈ P and R ∈ P (not necessarily distinct) intersecting S such that for any
part X:

• if L <_P X <_P R, then X ⊆ S;

• if X <_P L or R <_P X, then X ∩ S = ∅.
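
Definition 12 translates directly into a short test. The Python sketch below is only an illustration, under the assumption that an ordered partition is given as a list of sets and that the candidate subset is contained in the ground set; the function name is ours.

```python
def is_interval(ordered_partition, subset):
    """Check whether `subset` is an interval of the ordered partition."""
    # Indices of the parts intersecting the subset, in partition order.
    touched = [i for i, part in enumerate(ordered_partition) if part & subset]
    if not touched:
        return False
    first, last = touched[0], touched[-1]
    # Parts before `first` and after `last` are disjoint from the subset by
    # construction; every part strictly in between must be fully contained.
    return all(ordered_partition[i] <= subset for i in range(first + 1, last))
```

The first and last touched parts play the roles of L and R in the definition and may be only partially covered by the subset.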

To compute a factoring permutation, the main steps of the algorithm we present are: 1)
computation of an ordered partition that is a modular partition M(G, v) such that the strong modules
containing a vertex v are intervals of M(G, v); and 2) recursive computation of a factoring
permutation of each of the subgraphs induced by a module M ∈ M(G, v).

Algorithm 4: Factoring-permutation(G, v)
Input: A graph G = (V, E) and a vertex v ∈ V
Output: A factoring permutation of G
begin
    Let P = [N̄(v), {v}, N(v)] be an ordered partition;
    Apply Algorithm 2 with the following refinement rule;
        Let x be the current pivot vertex and Y a part such that N(x) ⊥ Y;
        if x <_P v <_P Y or Y <_P v <_P x then
            Substitute Y by [Y ∩ N̄(x), Y ∩ N(x)];
        else
            Substitute Y by [Y ∩ N(x), Y ∩ N̄(x)];
    foreach part X ∈ M(G, v) such that |X| > 1 do
        Let x be the last vertex of X used as pivot;
        P_X ← Factoring-permutation(G[X], x);
        Substitute X by P_X;
end



Theorem 10 Algorithm 4 computes in time O(n + m log n) a factoring permutation of a graph
G = (V, E).



Proof: Using Lemma 12, M(G, v) can be computed in O(n + m log n). By Lemma 13, any module
not containing v is a subset of some module of M(G, v). It thereby suffices to prove that the
following invariant is satisfied by Algorithm 4 (see Figure 11):

Π = any strong module containing v is an interval of the current partition

The property Π is obviously satisfied by the initial partition [N̄(v), {v}, N(v)]. Assume by induction
that Π holds before the current partition P is refined by N(x) for some vertex x. Let M be a module
containing v and X be a part of P such that X ⊥ N(x). There are two distinct cases:

• x ∉ M: no vertex y of X ∩ N(x) belongs to M, otherwise x would be a splitter for v and y;

• x ∈ M: if X ⊆ N(v), then any vertex y ∈ X ∩ N̄(x) belongs to M, otherwise y would be a
splitter for x and v. Similarly if X ⊆ N̄(v), then any vertex y ∈ X ∩ N(x) belongs to M.

It follows that P' = Refine(P, N(x)) also satisfies the invariant Π. The complexity analysis is similar
to the analysis of Algorithm 2. □

Figure 11: Layout of the modular decomposition tree MD(G) such that the neighbours of v are
placed on the right of v and the non-neighbours on the left. The right tree highlights the modules
of M(G, v) and the strong modules M1, M2, M3 and M4 containing v. Algorithm 4 first computes
the partition M(G, v) and then recursively solves the problem on each module of M(G, v).



6.2 The case of cographs 

The natural question is how to get rid of the log n factor in the complexity of Algorithm 4.
Restricting the problem to cographs (or totally decomposable graphs - see Section 2.5) gives some
ideas. The reader should keep in mind that the log n factor corresponds to the number of times the
neighbourhood of a vertex can be used to refine the partition. So, a linear time algorithm should
use each vertex as a pivot a constant number of times.

The linear time cograph recognition algorithm proposed in [HP05] computes a factoring
permutation as a preliminary step. It roughly proceeds as follows. It uses at most one vertex per
partition part to refine the ordered partition [N̄(v), {v}, N(v)]. Assuming the input graph is a
cograph, when none of the parts of the current partition is free of pivot, it can be proved that
one of the two non-singleton parts closest to v in the current partition, say X, can be refined into
[N̄(x) ∩ X, {x}, N(x) ∩ X] (x being the used pivot of X). This step creates at least one new part
free of pivot and thereby relaunches the refining process.



6.3 From factoring permutation to modular decomposition tree 

As already noticed, a natural idea to compute the modular decomposition tree is to compute for
each pair x, y of vertices the set of splitters S(x, y). Unfortunately a linear time algorithm could
not afford the computation of all these O(n²) sets. But if one has in hand a factoring permutation
σ, it is then sufficient to consider the pairs of consecutive vertices in σ. Indeed, Capelle et al.'s
algorithm [CHdM02] only computes for each pair of vertices x = σ(i) and y = σ(i + 1) (i ∈ [1, n − 1])
the leftmost and the rightmost (in σ) splitter of x and y. These two splitters define two intervals
of σ, which are both contained in m(x, y), the smallest module containing both x and y:

• the left fracture Fg(x, y) = [z, x] if z is the leftmost splitter of {x, y} in [σ(1), y] (if any);

• the right fracture Fd(x, y) = [y, z] if z is the rightmost splitter of {x, y} in [x, σ(n)] (if any).
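
The two kinds of fractures can be computed naively as in the following Python sketch. It is a quadratic illustration under our own conventions, not the linear time traversal of [CHdM02]: the graph is a dictionary `adj` mapping each vertex to its set of neighbours, `sigma` is a list of vertices, and a fracture is returned as the pair of its first and last elements in σ.

```python
def fractures(adj, sigma):
    """Left and right fractures of every consecutive pair of sigma."""
    pos = {x: i for i, x in enumerate(sigma)}
    left, right = {}, {}
    for i in range(len(sigma) - 1):
        x, y = sigma[i], sigma[i + 1]
        # Positions of the splitters of {x, y}: vertices adjacent to exactly one of them.
        splitters = [pos[z] for z in sigma
                     if z not in (x, y) and ((z in adj[x]) != (z in adj[y]))]
        before = [p for p in splitters if p < i]
        after = [p for p in splitters if p > i + 1]
        if before:
            left[(x, y)] = (sigma[min(before)], x)   # interval [z, x]
        if after:
            right[(x, y)] = (y, sigma[max(after)])   # interval [y, z]
    return left, right
```

Pairs without a splitter on one side simply have no entry in the corresponding dictionary.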




Figure 12: A graph G = (V, E) for which σ = 1 2 3 4 5 6 7 8 9 10 11 is a factoring permutation
(see Definition 4). The right fracture of (3, 4) does not exist but Fg(3, 4) = [2, 3]. We also have
Fd(1, 2) = [2, 7] = Fg(7, 8).



The set of fractures (left and right) defines a parenthesis system. Forgetting the initial pairing
of the parentheses, this system naturally yields a tree, called the fracture tree and denoted FT(G)
(see Figure 12). The fracture tree is actually a good estimation of MD(G) (see Lemma 17)
which can be computed in linear time by two traversals of σ: the first traversal computes the
fractures, the second builds the tree.



Lemma 17 [CHdM02] Let σ be a factoring permutation of a graph G and M be a strong module
of G. If M is a prime node of MD(G) and if the father of M is a degenerate node, then there exists a
node N of the fracture tree FT(G) such that M is the set of leaves of the subtree of FT(G) rooted
at N.



For example, in Figure 12 any strong module but M = {8, 9, 10, 11} is represented by some
node of FT(G). Let us notice that the above lemma does not imply that the strong module
{2, 3, 4} has a corresponding node in FT(G).

Hence, to compute MD(G), the fracture tree FT(G) has to be cleaned. To that aim, Capelle
et al. [CHdM02] use four extra traversals of the factoring permutation. The first one identifies the
strong modules represented by some nodes of FT(G); the second finds the dummy nodes of FT(G);
the third searches for strong modules that are merged in a single node of FT(G); and the last one
removes the nodes of FT(G) that do not represent strong modules. The complexity of each of
these four traversals is linear in the size of G, i.e. O(n + m).



6.4 Bibliographic notes 

An attempt to generalize to arbitrary graphs the linear time algorithm which computes a
factoring permutation of a cograph has been proposed in [HdMP04]. Unfortunately the algorithm
of [HdMP04] contains a flaw. The recent linear time modular decomposition algorithm presented
in [TCHP08] mixes the ideas from the factoring permutation algorithms and the skeleton algorithm.
It generalizes the ordered partition refining technique to tree partitions and avoids union-find or
least common ancestor data-structures. In that sense this new algorithm may be considered as a
positive answer to Spinrad's comment (see Section 5.4).

7 Three novel applications of the modular decomposition

As mentioned in the introduction, modular decomposition is used in a number of algorithmic graph
theory applications and more generally applies to various discrete structures (see [MR84]). We
conclude this survey with the presentation of three novel applications which are good witnesses of
the use of modular decomposition. The first one is a pattern matching problem which is closely
related to the concept of factoring permutations. The second one provides an example of dynamic
programming on the modular decomposition tree in the context of comparative genomics. Finally,
we list a series of parameterized problems for which module based data-reduction rules lead to
polynomial size kernels.



7.1 Pattern matching - common intervals of two permutations 

Motivated by a series of genetic algorithms for sequencing problems, e.g. the TSP, Uno and
Yagiura [UY00] formalized the concept of common interval of two permutations. As we will see
in the next subsection, in the context of comparative genomics, common intervals reveal conserved
structures in chromosomal material.



Definition 13 A set S of elements is a common interval of a set of permutations Σ if in each
permutation σ ∈ Σ, the elements of S form an interval of σ (see Section 2.2 for the definition of
an interval).

It is fairly easy to observe that the family X of common intervals of two permutations is a weakly
partitive family (see Definition [TJ) and thus all the results from the theory presented in Section 2.1
apply. In particular, the strong common intervals are organized into a tree, namely the strong
interval tree.
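To make Definition 13 concrete, the following Python fragment is a minimal brute-force sketch, assuming each permutation is given as a list whose entry at index p is the element at position p. The function names (`is_interval`, `common_intervals`, `strong_common_intervals`) are hypothetical, and the cubic enumeration is only meant to illustrate the definitions, not the efficient algorithms discussed in this section.

```python
def is_interval(perm, subset):
    """True if the elements of `subset` occupy consecutive positions in `perm`."""
    positions = [p for p, x in enumerate(perm) if x in subset]
    return (len(positions) == len(subset)
            and max(positions) - min(positions) + 1 == len(subset))


def common_intervals(perms):
    """All non-empty common intervals of a list of permutations (naive scan)."""
    first = perms[0]
    found = []
    for a in range(len(first)):
        for b in range(a, len(first)):
            candidate = frozenset(first[a:b + 1])   # an interval of the first permutation
            if all(is_interval(p, candidate) for p in perms[1:]):
                found.append(candidate)
    return found


def overlap(s, t):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(s & t) and not s <= t and not t <= s


def strong_common_intervals(perms):
    """Common intervals that overlap no other common interval."""
    intervals = common_intervals(perms)
    return [s for s in intervals if not any(overlap(s, t) for t in intervals)]
```

For instance, for the two permutations [1, 2, 3] and [1, 2, 3], the sets {1, 2} and {2, 3} are common intervals but overlap, so only the singletons and the whole set are strong.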



Figure 13: The strong interval tree of two permutations. Note that {9, 10, 11} and {8, 9} are also
common intervals, but they are not strong as they overlap.

Despite the existence of the (weakly) partitive set theory for more than thirty years, the natural
concept of interval substitution and decomposition appeared only very recently in the context
of the combinatorial study of permutations (see e.g. [AS02, AA05]). Atkinson and Stitt [AS02]
(re)discovered the concept of substitution under the name of wreath product. In 2005, Albert and
Atkinson showed that, if the number of simple (i.e. prime) permutations in a pattern-restricted
class of permutations is finite, the class has an algebraic generating function and is defined by
a finite set of restrictions. More recently, Bouvel, Rossin and Vialette [BR06, BRV07] used the
strong interval tree to solve the longest common pattern problem between two permutations.

Uno and Yagiura [UY00] proposed the first linear time algorithm to enumerate the common
intervals of two permutations. More precisely, it runs in O(n + K) time, where K is the number of
those common intervals (which is possibly quadratic). Alternative algorithms have been recently
proposed [HMS09, BCdMR08]. We sketch Uno and Yagiura's algorithm and discuss how it can be
generalized to compute the modules of a graph when a factoring permutation is given.

Without loss of generality, we will consider the problem of computing the common intervals of a
permutation σ and the identity permutation I_n. To identify the common intervals of σ and I_n,
the algorithm traverses σ only once. We denote by [i, j] the interval of σ composed of
the elements whose indexes are between i and j in σ: i.e. [i, j] = {x | i ≤ σ(x) ≤ j}. An
element x ∉ [i, j] is a splitter of the interval [i, j] if there exist y ∈ [i, j] and z ∈ [i, j] such that
y < x < z. By s([i, j]) we denote the number of splitters of the interval [i, j]. The algorithm uses
a list Potentiel to filter and extract the common intervals of σ and I_n. An element r belongs
to the list Potentiel as long as it may be the right boundary of a common interval. Step i
consists in removing those elements which we know cannot be the right boundary of a common
interval whose left boundary is at most i. This filtering can be done efficiently by computing
s([i, j]) (see Lemmas 18 and 19).



Algorithm 5: Uno and Yagiura's algorithm [UY00]

Input: A permutation σ
Output: The set of intervals common to σ and the identity permutation I_n
begin
    Let Potentiel be an empty list;
    for i = n downto 1 do
        (Filter) Remove from Potentiel the boundaries r s.t. ∀j ≤ i, [j, r] is not a common
        interval of σ and I_n;
        (Extraction) Search Potentiel to find the boundaries r s.t. [i, r] is a common
        interval of σ and I_n and output those intervals [i, r];
        (Addition) Add i to Potentiel;
    end
end



The following properties are fundamental to the correctness of the algorithm:

Lemma 18 [UY00] An interval [i, j] of σ is a common interval of σ and I_n iff s([i, j]) = 0.

Lemma 19 [UY00, BXHP05] If s([i, j]) > s([i, j + 1]), then there does not exist r < i such that [r, j]
is a common interval of σ and I_n.
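The splitter count and the characterization of Lemma 18 can be illustrated by a short Python sketch. It assumes that the permutation is a list `perm` of the values 1..n, with `perm[p]` the element at (0-indexed) position p; the function names are hypothetical. The scan below is a naive quadratic replacement for the Potentiel list, so it does not achieve the O(n + K) bound of Algorithm 5; it only follows the same right-to-left loop structure.

```python
def splitters(perm, i, j):
    """Number of splitters s([i, j]) of the positions i..j (0-indexed, inclusive).

    A splitter is a value outside the block lying strictly between two values
    of the block, so s = (max - min + 1) - (block size).
    """
    block = perm[i:j + 1]
    return (max(block) - min(block) + 1) - len(block)


def common_intervals_with_identity(perm):
    """Pairs (i, r), i < r, such that positions i..r form a common interval
    of perm and the identity (Lemma 18: s([i, r]) = 0)."""
    n = len(perm)
    result = []
    for i in range(n - 1, -1, -1):          # i = n downto 1 in the paper's numbering
        lo = hi = perm[i]
        for r in range(i + 1, n):           # naive scan instead of the Potentiel list
            lo = min(lo, perm[r])
            hi = max(hi, perm[r])
            if (hi - lo) - (r - i) == 0:    # same quantity as splitters(perm, i, r)
                result.append((i, r))
    return result
```

On the simple permutation [2, 4, 1, 3], the only interval of size at least two reported is the whole permutation, as expected for a prime structure.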

The second lemma above means that if s([i, j]) > s([i, j + 1]) then the vertex σ^{-1}(j + 1) is a
splitter of [i, j]. Thereby any common interval containing [i, j] as a subset has to extend up to
σ^{-1}(j + 1).



Application to factoring permutations of a graph. The most striking link between common
intervals and modules of graphs is observed on permutation graphs (see Lemma 20). Permutation
graphs are defined as the intersection graphs of a set of segments between two parallel lines
(see [Gol80, BLS99] for example). It follows that the vertices of a permutation graph G = (V, E)
can be numbered from 1 to n such that there exists a permutation σ of [1, n] such that the vertex
numbered i is adjacent to the vertex numbered j iff i < j and σ(j) < σ(i). The permutations σ and I_n
form the realizer of G. As first observed by de Montgolfier, any permutation belonging to a realizer
of a permutation graph is a factoring permutation of that graph. It follows from Lemma [T] that:

Lemma 20 [dM03] Let G = (V, E) be a permutation graph and (I_n, σ) be its realizer. A set of
vertices M is a strong module iff M is a strong common interval of I_n and σ.

The permutation graph corresponding to the permutations depicted in Figure 13 is the graph
G of Figure 3. Notice that the strong interval tree of these two permutations is isomorphic to the
modular decomposition tree of G.
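The link between realizers and modules can be checked experimentally. The Python sketch below builds the permutation graph of a realizer (I_n, σ) from the adjacency rule given above and tests, by brute force, that every common interval is a module. It assumes the permutation is a list `perm` where `perm[p]` is the vertex at position p, so that the position of x plays the role of σ(x); all function names are hypothetical.

```python
def permutation_graph(perm):
    """Adjacency sets of the permutation graph realized by (identity, perm)."""
    pos = {x: p for p, x in enumerate(perm)}
    adj = {x: set() for x in perm}
    for i in perm:
        for j in perm:
            # i is adjacent to j iff i < j and sigma(j) < sigma(i)
            if i < j and pos[j] < pos[i]:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def is_module(adj, subset):
    """True if every outside vertex sees all of `subset` or none of it."""
    subset = set(subset)
    for x in set(adj) - subset:
        hits = len(adj[x] & subset)
        if hits not in (0, len(subset)):
            return False
    return True


def common_interval_sets(perm):
    """Value sets of size >= 2 that are intervals of perm and of the identity."""
    found = []
    for i in range(len(perm)):
        for r in range(i + 1, len(perm)):
            block = perm[i:r + 1]
            if max(block) - min(block) == r - i:
                found.append(set(block))
    return found
```

Note that the sketch only checks one direction (common intervals are modules); the equivalence of Lemma 20 concerns strong modules and strong common intervals.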

It follows from Lemma 20 that, applied to the realizer of a permutation graph, Algorithm 5
computes its strong modules. Though some extra work is required to obtain the modular
decomposition tree, the complexity remains linear. Moreover, as shown in [BXHP05], Uno and
Yagiura's algorithm can be directly adapted to compute, given a factoring permutation, the strong
modules of a graph. The number s([i, j]) becomes the number of splitters (in the sense of the
modular decomposition, see Section 2.3) of the vertices contained in the interval [i, j] of the factoring
permutation. Now notice that Algorithm 5 does not only output the strong common intervals.
In order to restrict the enumeration to strong modules, a slight modification is required. A first
traversal computes the strong right modules (i.e. the modules that are intervals of σ and which are
not overlapped on their right boundary by any other module). Then a second traversal can detect
those modules which are overlapped on the left boundary.



7.2 Comparative genomics - perfect sorting by reversals

A reversal in a permutation σ consists in reversing the order of the elements of an interval of σ.
When dealing with signed permutations (whose elements are positive or negative), a reversal also
flips the sign of the elements of the reversed interval. Given two (signed) permutations σ and τ, the
problem of sorting by reversals asks for a series of reversals (a scenario) to transform σ into τ.
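As a small illustration of the operation just defined, here is a minimal Python sketch of a signed reversal. It assumes a signed permutation is stored as a list of non-zero integers and that the reversed interval is given by 0-indexed inclusive positions; the function name `reversal` is hypothetical.

```python
def reversal(perm, i, j):
    """Reverse the block at positions i..j (0-indexed, inclusive) and flip its signs."""
    flipped = [-x for x in reversed(perm[i:j + 1])]
    return perm[:i] + flipped + perm[j + 1:]
```

For example, reversing positions 1..2 of [1, -3, -2, 4] yields the identity [1, 2, 3, 4], and applying the same reversal twice gives back the original permutation.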

Sorting by reversals is used in comparative genomics to measure the evolutionary distance between
the genomes of two chromosomes, modeled as signed permutations [BHS02]. When comparing
two genomic sequences, it can be assumed that the intervals having the same gene content are likely
to have been present in their common ancestor and may indicate functionally interacting
proteins. Such a conserved genomic structure corresponds, in the signed permutation model, to
common intervals. So to guess an evolutionary scenario between two genomic sequences represented
by signed permutations σ and τ, one could ask for the smallest perfect scenario, which is a series
of reversals that preserves any common interval of σ and τ. For further details on this topic, the
reader could refer to [BHS02, BBCP04].

As mentioned in the previous subsection, the set of common intervals of two permutations
(signed or not) defines a weakly partitive family. It follows that one can distinguish prime from
degenerate strong common intervals. As shown by the following lemma, the perfect scenarios can
be read from the strong interval tree.

Lemma 21 [BBCP07] A reversal scenario for two signed permutations σ and τ is perfect iff any
reversed interval is either a prime common interval of σ and τ, or the union of strong common
intervals which form a subset of the children of a prime common interval.

Figure 14: A perfect scenario of length 7.



It follows from the previous lemma that the strong interval tree is useful to compute minimum
perfect scenarios. Indeed, with some extra technical properties to deal with the signs, it can be shown
that a simple dynamic programming algorithm on the strong interval tree solves the problem in
time O(2^k · n√(n log n)), where k is the maximum number of prime nodes which are children of the
same prime node. In practice, the parameter k remains very small [BCP08]: this is the case, e.g., when
comparing the chromosome X of the mouse and the rat [BBCP07].



7.3 Parameterized complexity and kernel reductions - cluster editing 

The design of parameterized algorithms is one of the modern techniques to cope
with NP-hard problems. A problem Π is fixed parameter tractable (FPT) with respect to parameter
k if it can be solved in time f(k) · n^{O(1)}, where n is the input size. The idea behind parameterized
algorithms is to find a parameter k, as small as possible, which controls the combinatorial explosion.
Many algorithmic techniques have been developed in the context of fixed parameter complexity,
among which kernelization. A parameterized problem (Π, k) admits a polynomial kernel if there
is a polynomial time algorithm (a set of reduction rules) that reduces the input instance to an
instance whose size is bounded by a polynomial p(k) depending only on k, while preserving the
output. The classical example of a parameterized problem having a polynomial kernel is the problem
vertex cover parameterized by the solution size k, which has a 2k vertex kernel. For textbooks
on this topic, the reader should refer to [DF99, Nie06, FG06].

Recently, modular decomposition has appeared in kernelization algorithms for a series of
parameterized problems, among which: cluster editing [Nie06], bicluster editing [PdSS07], FAST
(feedback arc set in tournament) [DGH+06], closest 3-leaf power [BPP09] and flip consensus
tree [BBT08]. We discuss the cluster editing problem. Concerning the others, the reader
should refer to the original papers.

The parameterized cluster editing problem asks whether the edge set of an input graph G
can be modified by at most k modifications (deletions or insertions) such that the resulting graph
H is a disjoint union of cliques (i.e. clusters). This problem is NP-complete but can be solved in
time O*(3^k) by a simple bounded search tree algorithm [Cai96], which iteratively branches on at
most k P_3's. Recent papers [Guo07, FLRS07] showed the existence of a linear kernel (best bound
is 4k). The reduction rules used for these linear kernels are crown rules involving modules. For the
sake of simplicity we only present the two basic reduction rules which lead to a quadratic vertex
kernel.
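The bounded search tree idea mentioned above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the algorithm of [Cai96] verbatim: the graph is a dictionary mapping each vertex to its set of neighbours, vertices are comparable values, and the helper names (`find_p3`, `toggle`, `cluster_editing`) are hypothetical. A graph is a disjoint union of cliques iff it has no induced P_3, and any solution must modify one of the three vertex pairs of a given induced P_3, which gives the three-way branching.

```python
def find_p3(adj):
    """Return an induced path u - v - w (u, w non-adjacent), or None."""
    for v in adj:
        nb = sorted(adj[v])
        for a in range(len(nb)):
            for b in range(a + 1, len(nb)):
                u, w = nb[a], nb[b]
                if w not in adj[u]:
                    return u, v, w
    return None


def toggle(adj, x, y):
    """Copy of the graph in which the pair {x, y} is flipped (edge <-> non-edge)."""
    new = {v: set(ns) for v, ns in adj.items()}
    if y in new[x]:
        new[x].discard(y)
        new[y].discard(x)
    else:
        new[x].add(y)
        new[y].add(x)
    return new


def cluster_editing(adj, k):
    """True iff at most k edge modifications turn the graph into a cluster graph."""
    p3 = find_p3(adj)
    if p3 is None:
        return True            # no induced P_3: already a disjoint union of cliques
    if k == 0:
        return False
    u, v, w = p3
    # Branch on the three ways to destroy the P_3: delete uv, delete vw, insert uw.
    return any(cluster_editing(toggle(adj, x, y), k - 1)
               for x, y in ((u, v), (v, w), (u, w)))
```

The recursion depth is at most k and each node has three children, which matches the 3^k factor of the stated running time.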

Lemma 22 Let G = (V, E) be a graph. A quadratic vertex kernel for the cluster editing
problem is obtained by the following reduction rules:

1. Remove from G the connected components which are cliques.

2. If G contains a clique module C of size at least k + 1, then remove |C| - k - 1 vertices
from C.

It is clear that these rules can be applied in linear time using modular decomposition algorithms.
The proof idea works as follows. The first rule is obviously safe. Concerning the second rule, simply
observe that to disconnect a clique module of size k + 1 from the rest of the graph, at least k + 1 edge
deletions are required. Now assuming G is a positive instance, each cluster of the resulting graph
H can be bipartitioned into the vertices non-incident to a modified edge and the other vertices (the
affected vertices). The k edge modifications affect at most 2k vertices and, since by the first rule
every cluster contains an affected vertex, they create at most 2k clusters. In each cluster, the
unaffected vertices form a clique module of G, so by the second rule there are at most k + 1 of them.
This shows that the number of vertices in the reduced graph is at most 2k(k + 1) + 2k = 2k^2 + 4k.
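The two reduction rules of Lemma 22 can be sketched as follows in Python. This is only an illustrative, non-linear-time sketch: it does not use a modular decomposition algorithm, but relies on the observation that the maximal clique modules are exactly the classes of vertices sharing the same closed neighbourhood. The graph is assumed to be a dictionary mapping each vertex to its set of neighbours, and the function names are hypothetical.

```python
def components(adj):
    """Connected components of the graph, as a list of vertex sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def reduce_cluster_editing(adj, k):
    """Apply Rule 1 then Rule 2 once and return the reduced graph."""
    # Rule 1: drop the connected components that are cliques.
    keep = set()
    for comp in components(adj):
        if any(len(adj[v]) != len(comp) - 1 for v in comp):
            keep |= comp
    # Rule 2: shrink every maximal clique module to at most k + 1 vertices.
    classes = {}
    for v in keep:
        classes.setdefault(frozenset(adj[v] | {v}), []).append(v)
    kept = set()
    for members in classes.values():
        kept.update(sorted(members)[:k + 1])
    return {v: adj[v] & kept for v in kept}
```

The parameter k is left unchanged by both rules; only the graph shrinks.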

The bicluster editing problem edits the edge set of a graph to obtain a disjoint union of
complete bipartite graphs. Instead of considering clique modules, we need to consider independent
set modules [PdSS07]. The proof is then slightly more complicated and relies on a careful analysis
of the modification of the modular decomposition under edge insertion or deletion. In the case
of FAST, similar rules involving transitive modules also yield a quadratic kernel bound. Note
that for these two problems, linear kernels can be obtained with more sophisticated reduction
rules [GHKZ08, BFG+09].

8 Conclusions and perspectives 

An important remaining open problem is the design of a simple linear time certifying algorithm
for modular decomposition. In fact, the algorithms described here produce a labelled tree, and it can
be checked in linear time whether such a tree is a decomposition tree. But to certify that some
decomposition tree is the modular decomposition tree, one must certify all node labels. The bottleneck
is the certification of prime nodes.

We have presented above the principles of a fully dynamic algorithm for modular decomposition
of cographs; this can also be done for permutation graphs and interval graphs using their geometric
representation ( TOG [Cre09] . Fully dynamic modular decomposition for the general case is still an 
open problem. 

For some applications one wants to extend the notion of module to some notion of approximate
module, which relaxes the requirement of having the same behaviour outside of the module.
Several attempts have already been considered [BXHLdM09]. The main difficulty is to find an
interesting extension of modules that is polynomially tractable, since many of the natural extensions
lead to NP-complete problems [FP03].